Simplify fsst_compress_offsets_overflow_i32 test with empty compressor by joseph-isaacs · Pull Request #8158 · vortex-data/vortex

joseph-isaacs · 2026-05-29T15:20:43Z

Summary

Refactor the fsst_compress_offsets_overflow_i32 test to use an empty FSST compressor instead of training one on random data. This simplification:

Reduces memory allocation from ~5 GiB to ~3.4 GiB by eliminating the random data pool and using deterministic escape-based expansion (2x input size) instead of incompressible random data (~1x input size).
Removes randomness by replacing StdRng and random pool generation with a single reused buffer of repeated bytes, making the test deterministic and faster.
Maintains regression coverage by exercising the same output-offset overflow path. The empty compressor's escape factor (2x + 7 bytes) is the worst-case FSST expansion, so this is the cheapest way to reach the i32::MAX boundary while still crossing it.
Adds explicit assertion that compressed output actually exceeds i32::MAX, preventing silent test degradation if FSST's escape behavior changes.

The test still allocates ~1.1 GiB of input and ~2.25 GiB of output, so it remains gated to CI and respects VORTEX_SKIP_SLOW_TESTS.
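The input size follows from simple arithmetic. The sketch below is illustrative only and is not taken from the test: it assumes each input byte expands to exactly two output bytes and ignores the small constant padding mentioned above. `min_input_to_exceed` is a hypothetical helper name.

```rust
/// Smallest input size (in bytes) whose output exceeds `limit`, assuming every
/// input byte expands to exactly `factor` output bytes (hypothetical helper).
fn min_input_to_exceed(limit: u64, factor: u64) -> u64 {
    limit / factor + 1
}

fn main() {
    let limit = i32::MAX as u64; // 2_147_483_647
    let input = min_input_to_exceed(limit, 2);
    let output = input * 2;

    // The output crosses the boundary, and one byte less of input would not.
    assert!(output > limit);
    assert!((input - 1) * 2 <= limit);

    let gib = (1u64 << 30) as f64;
    println!("minimum input: {} bytes ({:.2} GiB)", input, input as f64 / gib);
    println!("resulting output: {} bytes ({:.2} GiB)", output, output as f64 / gib);
}
```

Under these assumptions the minimum input is exactly 1 GiB, so the ~1.1 GiB used by the test leaves a modest margin above the boundary.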

Testing

Existing test infrastructure covers this change. The test is gated to CI runs and validates that the regression (i32 offset overflow in FSST output) is properly exercised by asserting the compressed byte count exceeds i32::MAX.
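To illustrate the failure mode the assertion guards, here is a minimal sketch of cumulative offsets overflowing `i32` while still fitting in `i64`. It works on lengths only, so it allocates no large buffers. `cumulative_offsets` is a hypothetical helper and is not the Vortex implementation.

```rust
/// Build cumulative offsets for a sequence of value lengths, returning `None`
/// as soon as a running total does not fit in the offset type `T`.
/// Hypothetical helper for illustration only.
fn cumulative_offsets<T: TryFrom<u64>>(lengths: &[u64]) -> Option<Vec<T>> {
    let mut total: u64 = 0;
    let mut offsets = Vec::with_capacity(lengths.len() + 1);
    offsets.push(T::try_from(0).ok()?);
    for &len in lengths {
        total = total.checked_add(len)?;
        offsets.push(T::try_from(total).ok()?);
    }
    Some(offsets)
}

fn main() {
    // Two values of 1 GiB each: the final offset is 2^31, one past i32::MAX.
    let lengths = [1u64 << 30, 1u64 << 30];

    let narrow: Option<Vec<i32>> = cumulative_offsets(&lengths);
    let wide: Option<Vec<i64>> = cumulative_offsets(&lengths);

    assert!(narrow.is_none()); // i32 offsets cannot represent 2^31
    assert_eq!(wide.unwrap().last().copied(), Some(1i64 << 31));
}
```

A test that never pushes the total past `i32::MAX` would pass with either offset width, which is why asserting on the compressed byte count matters.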

https://claude.ai/code/session_01212r9Sii7DxuVZ1UJ9oqx1

The fsst_compress_offsets_overflow_i32 regression compressed ~2.5 GiB of random, incompressible data with a trained compressor to push cumulative FSST output past i32::MAX. Profiling the debug build (samply) showed ~90% of runtime in fsst::Compressor::compress_into and ~18 s in per-byte RNG pool generation.

Use an empty FSST compressor instead: with no symbols every byte is emitted as a two-byte escape, so output is deterministically 2x the input. That crosses the i32::MAX boundary with only ~1.1 GiB of input (no random data needed), which is the cheapest possible way to reach the boundary since escapes are FSST's worst-case expansion.

The test now also asserts the actual compressed byte size exceeds i32::MAX so it cannot silently stop covering the regression.

Measured (debug, single run): 307 s -> 186 s (~1.65x), peak memory ~5 GiB -> ~3.4 GiB, RNG generation eliminated.

Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
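The 2x expansion can be modelled without the fsst crate. The sketch below is a standalone model of escape-only encoding, not the crate's API: it assumes the escape marker is code 255 followed by the literal byte, and the function names are hypothetical.

```rust
/// Escape marker: the byte that follows it is emitted as a literal.
/// Assumed to be 255 for this model.
const ESCAPE_CODE: u8 = 255;

/// Model of compressing with an empty symbol table: no symbol can match,
/// so every input byte becomes an (escape, literal) pair.
fn escape_only_encode(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(input.len() * 2);
    for &byte in input {
        out.push(ESCAPE_CODE);
        out.push(byte);
    }
    out
}

/// Inverse of `escape_only_encode`: keep the literal from each pair.
fn escape_only_decode(encoded: &[u8]) -> Vec<u8> {
    encoded.chunks_exact(2).map(|pair| pair[1]).collect()
}

fn main() {
    // A small buffer of repeated bytes stands in for the test's reused buffer.
    let input = vec![b'a'; 1024];
    let encoded = escape_only_encode(&input);

    assert_eq!(encoded.len(), 2 * input.len()); // deterministic 2x expansion
    assert_eq!(escape_only_decode(&encoded), input); // round-trips losslessly
}
```

Because the expansion does not depend on the input contents, a single repeated byte value is enough, which is what makes the random data pool unnecessary.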

AdamGS · 2026-05-29T15:24:44Z

-/// The input is built with [`VarBinBuilder<i64>`] so the input itself does not panic, which
-/// confirms the overflow is on the FSST output side. After the fix the test must succeed
-/// with the row count preserved.
+/// We force the output past [`i32::MAX`] with an empty FSST compressor: it has no symbols, so


is this test the best documented part of the codebase?

codspeed-hq · 2026-05-29T15:26:51Z

Merging this PR will not alter performance

⚠️

Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

✅ 1275 untouched benchmarks

Comparing claude/great-feynman-EqrcJ (579afe6) with develop (73454db)

Give fsst_compress_offsets_overflow_i32 priority 100 so nextest schedules it at the start of the workspace run, mirroring the existing compress_large_int override. The test still takes ~3 minutes; dispatching it first lets its latency overlap with the rest of the suite instead of trailing at the end as the long pole.

Verified locally with nextest 0.9.137: the filter selects exactly this test, and a controlled --test-threads=1 experiment confirmed priority=100 moves a normally-last test to first in dispatch order.

Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
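For readers unfamiliar with nextest overrides, a priority override of this kind might look roughly like the fragment below. This is a sketch, not the change in this PR: it assumes the override lives in the `default` profile of `.config/nextest.toml` and that the filter matches on the test name; the actual profile and filter expression in the repository may differ.

```toml
# Hypothetical fragment of .config/nextest.toml (profile name is an assumption).
[[profile.default.overrides]]
# Select only the slow regression test by name.
filter = 'test(fsst_compress_offsets_overflow_i32)'
# Higher-priority tests are dispatched first; 100 is the maximum.
priority = 100
```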

AdamGS reviewed May 29, 2026

joseph-isaacs wants to merge 2 commits into develop from claude/great-feynman-EqrcJ
